Bootstrapping language acquisition.

نویسندگان

  • Omri Abend
  • Tom Kwiatkowski
  • Nathaniel J Smith
  • Sharon Goldwater
  • Mark Steedman
چکیده

The semantic bootstrapping hypothesis proposes that children acquire their native language through exposure to sentences of the language paired with structured representations of their meaning, whose component substructures can be associated with words and syntactic structures used to express these concepts. The child's task is then to learn a language-specific grammar and lexicon based on (probably contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations (logical forms). Starting from these assumptions, we develop a Bayesian probabilistic account of semantically bootstrapped first-language acquisition in the child, based on techniques from computational parsing and interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syntax learning: the projection of lexical elements onto sentences by universal construction-free syntactic rules. Using an incremental learning algorithm, we apply the model to a dataset of real syntactically complex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but irrelevant distractors. Taking the Eve section of the CHILDES corpus as input, the model simulates several well-documented phenomena from the developmental literature. In particular, the model exhibits syntactic bootstrapping effects (in which previously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the "vocabulary spurt"), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings. The learner thus demonstrates how statistical learning over structured representations can provide a unified account for these seemingly disparate phenomena.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The “Globularization Hypothesis” of the Language-ready Brain as a Developmental Frame for Prosodic Bootstrapping Theories of Language Acquisition

In recent research (Boeckx and Benítez-Burraco, 2014a,b) have advanced the hypothesis that our species-specific language-ready brain should be understood as the outcome of developmental changes that occurred in our species after the split from Neanderthals-Denisovans, which resulted in a more globular braincase configuration in comparison to our closest relatives, who had elongated endocasts. A...

متن کامل

The sound symbolism bootstrapping hypothesis for language acquisition and language evolution.

Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infan...

متن کامل

The Development of Human 1

The Development of Human 2 2 Abstract This chapter discusses a uniquely human learning mechanism—bootstrapping-whereby external symbolic systems (especially language) enable the creation of new, internal representational resources. The process is described in general terms and is also illustrated with a specific case: The acquisition of concepts for positive integers (1, 2, 3, etc.). First, the...

متن کامل

A Linguistic Approach to Terminological Context Clustering

Clustering mechanisms are important for many NLP tasks such as knowledge acquisition, term extraction and disambiguation, machine translation and ontology building. Our approach fo-cuses on the clustering of terminological contexts as a bootstrapping method for such tasks. While most techniques involve statistical methods , we combine syntactic and semantic information about terms and their con...

متن کامل

An XML-Based Bootstrapping Method for Pattern Acquisition

Extensible Markup Language (XML) has been widely used as a middleware because of its flexibility. Fixed domain is one of the bottlenecks of Information Extraction (IE) technologies. In this paper we present a XML-based domain-adaptable bootstrapping method of pattern acquisition, which focuses on minimizing the cost of domain migration. The approach starts from a seed corpus with some seed patt...

متن کامل

Advances in the computational study of language acquisition.

This paper provides a tutorial introduction to computational studies of how children learn their native languages. Its aim is to make recent advances accessible to the broader research community, and to place them in the context of current theoretical issues. The first section locates computational studies and behavioral studies within a common theoretical framework. The next two sections revie...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Cognition

دوره 164  شماره 

صفحات  -

تاریخ انتشار 2017